Abstract

Let $LA_n(\tau)$ be the length of the longest alternating subsequence of a random permutation $\tau$ of $[n]$. Classical probabilistic arguments are provided to derive the asymptotic mean, variance and limiting law of $LA_n(\tau)$. Our methodology is robust enough to tackle similar problems for finite alphabet random words or even Markovian sequences.
Let $a := (a_1, a_2, \dots, a_n)$ be a sequence of length $n$ whose elements belong to a totally ordered set $\mathcal{A}$. Given an increasing set of indices $\{\ell_i\}_{i=1}^{m}$, we say that the subsequence $(a_{\ell_1}, a_{\ell_2}, \dots, a_{\ell_m})$ is alternating if $a_{\ell_1} > a_{\ell_2} < a_{\ell_3} > \cdots a_{\ell_m}$. The length of the longest alternating subsequence is then defined as

$$LA_n(a) := \max\{m : a \text{ has an alternating subsequence of length } m\}.$$

We revisit, here, the problem of finding the asymptotic behavior (in mean, variance and limiting law) of the length of the longest alternating subsequence in the context of random permutations and random words. For random permutations, these problems have seen complete solutions with contributions independently given (in alphabetical order) by Pemantle, Stanley and Widom. The reader will find in [10] a comprehensive survey, with precise bibliography and credits, on these and related problems. In the context of random words, Mansour [6] contains very recent contributions. Let us just say that, to date, the proofs developed to solve these problems are of a combinatorial or analytic nature and that we wish below to provide probabilistic ones. Our approach is developed via iid sequences uniformly distributed on $[0,1]$, counting minima and maxima, and the central limit theorem for 2-dependent random variables. Our arguments work as well to study the asymptotic behavior of the longest alternating subsequence of a random word $a \in \mathcal{A}^n$, where $\mathcal{A}$ is a finite ordered alphabet. Finally, we use a purely Markovian approach to get similar results for words generated by a Markov sequence.



2 Random permutations 

The asymptotic behavior of the length of the longest alternating subsequence has been studied by several authors, including Pemantle [10, page 684], Stanley [9] and Widom [12], who by a mixture of generating function methods and saddle point techniques get the following result:
Theorem 2.1 Let $\tau$ be a uniform random permutation in the symmetric group $S_n$, and let $LA_n(\tau)$ be the length of the longest alternating subsequence of $\tau$. Then,

$$E[LA_n(\tau)] = \frac{2n}{3} + \frac{1}{6} \quad (n \ge 2), \qquad \mathrm{Var}[LA_n(\tau)] = \frac{8n}{45} - \frac{13}{180} \quad (n \ge 4).$$

Moreover, as $n \to \infty$,

$$\frac{LA_n(\tau) - 2n/3}{\sqrt{8n/45}} \Rightarrow Z,$$

where $Z$ is a standard normal random variable and where, as usual, $\Rightarrow$ denotes convergence in distribution.

This section is devoted to giving a simpler probabilistic proof of the above result. To provide such a proof we make use of a well known correspondence which translates the problem into that of counting the maxima of a sequence of iid random variables uniformly distributed on $[0,1]$. In order to establish the weak limit result, a central limit theorem for $m$-dependent random variables is briefly recalled.

Let us recall some well known facts (Durrett [2, Chapter 1], Resnick [7, Chapter 4]). For each $n \ge 1$ (including $n = \infty$), let $\mu_n$ be the uniform measure on $[0,1]^n$ and, for each $n \ge 1$, let the function $T_n : [0,1]^n \to S_n$ be defined by $T_n(a_1, a_2, \dots, a_n) = \tau^{-1}$, where $\tau$ is the unique permutation $\tau \in S_n$ that satisfies $a_{\tau_1} < a_{\tau_2} < \cdots < a_{\tau_n}$. Note that $T_n$ is defined for all $a \in [0,1]^n$ except for those for which $a_i = a_j$ for some $i \ne j$, and this set has $\mu_n$-measure zero. A well known fact, sometimes attributed to Rényi [7], asserts that the pushforward measure $T_n\mu_n$, i.e., the image of $\mu_n$ by $T_n$, corresponds to the uniform measure on $S_n$, which we denote by $\nu_n$. The importance of this fact lies in the observation that the map $T_n$ is order preserving, that is, $a_i < a_j$ if and only if $(T_n a)_i < (T_n a)_j$. This implies that any event in $S_n$ has a canonical representative in $[0,1]^n$ in terms of the order relation of its components. Explicitly, if we consider the language $L$ of the formulas with no quantifiers, one variable (say, $x$) and with atoms of the form $x_i < x_j$, $i, j \in [n]$, then any event of the form $\{x : \varphi(x)\}$, where $\varphi \in L$, has the same probability in $[0,1]^n$ and in $S_n$ under the uniform measure. To give some examples, events like $\{x : x$ has an increasing subsequence of length $k\}$, $\{x : x$ avoids the permutation $\sigma\}$, $\{x : x$ has an alternating subsequence of length $k\}$ have the same probability in $[0,1]^n$ and $S_n$. In particular, it should be clear that

$$LA_n(\tau) \overset{d}{=} LA_n(a), \qquad (1)$$

where $\tau$ is a uniform random permutation in $S_n$, $a$ is a uniform random sequence in $[0,1]^n$, and the equality holds in distribution.
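As a small illustration of the correspondence, the map $T_n$ can be sketched as a rank computation. The function name `rank_map` and the sample values are assumptions made for this example only; the sketch presumes the entries of `a` are distinct, as they are almost surely.

```python
def rank_map(a):
    """Return T_n(a): entry i is the rank (1..n) of a[i] among the entries of a."""
    # order plays the role of tau: a[order[0]] < a[order[1]] < ...
    order = sorted(range(len(a)), key=lambda i: a[i])
    ranks = [0] * len(a)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank  # inverting tau gives the rank vector
    return ranks


a = [0.42, 0.07, 0.93, 0.55]  # hypothetical sample with distinct entries
t = rank_map(a)
# Order preservation: a_i < a_j exactly when (T_n a)_i < (T_n a)_j.
order_preserved = all(
    (a[i] < a[j]) == (t[i] < t[j]) for i in range(len(a)) for j in range(len(a))
)
```

Because only the relative order of the entries matters, any statistic defined through comparisons, such as the length of the longest alternating subsequence, takes the same value on `a` and on `t`.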

Maxima and minima. Next, we say that the sequence $a = (a_1, a_2, \dots, a_n)$ has a local maximum at the index $k$ if (i) $a_k > a_{k+1}$ or $k = n$, and (ii) $a_k > a_{k-1}$ or $k = 1$. Similarly, we say that $a$ has a local minimum at the index $k$ if (i) $a_k < a_{k+1}$ or $k = n$, and (ii) $a_k < a_{k-1}$. An observation that comes in handy is the fact that counting the length of the longest alternating subsequence is equivalent to counting maxima and minima of the sequence (starting with a local maximum). This is attributed to Bóna in Stanley [10]; for completeness, we prove it next.

Proposition 2.2 For $\mu_n$-almost all sequences $a = (a_1, a_2, \dots, a_n) \in [0,1]^n$,

$$LA_n(a) = \#\,\text{local maxima of } a + \#\,\text{local minima of } a \qquad (2)$$

$$= \mathbf{1}(a_n > a_{n-1}) + 2\cdot\mathbf{1}(a_1 > a_2) + 2\sum_{k=2}^{n-1} \mathbf{1}(a_{k-1} < a_k > a_{k+1}). \qquad (3)$$

Proof. For $\mu_n$-almost all $a \in [0,1]^n$, $a_i \ne a_j$ whenever $i \ne j$, therefore we can assume that $a$ has no repeated components. Let $t_1, \dots, t_r$ be the positions, in increasing order, of the local maxima of the sequence $a$, and let $s_1, \dots, s_{r'}$ be the positions, in increasing order, of the local minima of $a$, not including the local minima before the position $t_1$. Notice that the maxima and minima are alternating, that is, $t_i < s_i < t_{i+1}$ for every $i$, implying that $r' = r$ or $r' = r - 1$. Also notice that in case $r' = r - 1$, necessarily $t_r = n$. Therefore, because $(a_{t_1}, a_{s_1}, a_{t_2}, a_{s_2}, \dots)$ is an alternating subsequence of $a$, we have $LA_n(a) \ge r + r' = \#$ local maxima $+\ \#$ local minima.

To establish the opposite inequality, take a maximal sequence of indices $\{\ell_i\}_{i=1}^{m}$ such that $(a_{\ell_i})_{i=1}^{m}$ is alternating. Move every odd index upward, following the gradient of $a$ (the direction, left or right, in which the sequence $a$ increases), till it reaches a local maximum of $a$. Next, move every even index downward, following the gradient of $a$ (the direction, left or right, in which the sequence $a$ decreases), till it reaches a local minimum of $a$. Notice, importantly, that this sequence of motions preserves the order relation between the indices, therefore the resulting sequence of indices $\{\ell'_i\}_{i=1}^{m}$ is still increasing and, in addition, it is a subsequence of $(t_1, s_1, t_2, s_2, \dots)$. Now, because the sequence $(a_{\ell'_i})_{i=1}^{m}$ is alternating, it follows that $LA_n(a) \le \#$ local maxima $+\ \#$ local minima.

Finally, associating every local maximum not in the $n$-th position with the closest local minimum to its right, we obtain a one to one correspondence, which leads to (3). $\square$
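The identity between the longest alternating subsequence and the count of local extrema can be checked by brute force on small permutations. The sketch below is only an illustration: `la_dp` is a generic quadratic dynamic program written for this example (it is not the argument used in the proof), and `la_formula` evaluates the right-hand side of (3) for sequences with distinct entries.

```python
from itertools import permutations


def la_dp(a):
    """Longest subsequence with a[l1] > a[l2] < a[l3] > ..., by O(n^2) dynamic programming."""
    n = len(a)
    neg_inf = float("-inf")
    odd = [1] * n          # best odd length ending at i (i sits at a "high" position)
    even = [neg_inf] * n   # best even length ending at i (i sits at a "low" position)
    for i in range(n):
        for j in range(i):
            if a[j] > a[i]:
                even[i] = max(even[i], odd[j] + 1)
            if a[j] < a[i]:
                odd[i] = max(odd[i], even[j] + 1)
    return max(max(odd), max(even))


def la_formula(a):
    """Right-hand side of (3); assumes distinct entries."""
    n = len(a)
    if n == 1:
        return 1
    interior_peaks = sum(a[k - 1] < a[k] > a[k + 1] for k in range(1, n - 1))
    return int(a[-1] > a[-2]) + 2 * int(a[0] > a[1]) + 2 * interior_peaks


# Compare both computations on every permutation of size 1..6.
mismatches = [
    p for n in range(1, 7) for p in permutations(range(n)) if la_dp(p) != la_formula(p)
]
```

The linear-time formula is what makes the probabilistic analysis tractable, since it writes the statistic as a sum of indicators that each depend on at most three consecutive entries.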

Mean and variance. The above correspondence allows us to easily compute the mean and the variance of the length of the longest alternating subsequence by going 'back and forth' between $[0,1]^n$ and $S_n$. For instance, given a random uniform sequence $a = (a_1, \dots, a_n) \in [0,1]^n$, let $M_k := \mathbf{1}(a$ has a local maximum at the index $k)$, $k \in \{2, \dots, n-1\}$, and, for the boundary terms of (3), let $M_1 := \mathbf{1}(a_1 > a_2)$ and $M_n := \mathbf{1}(a_n > a_{n-1})$. Then

$$E[M_k] = \mu_n(a_{k-1} < a_k > a_{k+1}) = \mu_3(a_1 < a_2 > a_3) = \nu_3(\tau_1 < \tau_2 > \tau_3),$$

where again, $\nu_n$ is the uniform measure on $S_n$, $n \ge 1$. The event $\{\tau_1 < \tau_2 > \tau_3\}$ corresponds to the permutations $\{132, 231\}$, which shows that $E[M_k] = 1/3$.
Similarly,

$$E[M_1] = \nu_2(\tau_1 > \tau_2) = 1/2 \quad\text{and}\quad E[M_n] = \nu_2(\tau_1 < \tau_2) = 1/2.$$

Plugging these values in (3), we get that

$$E\,LA_n(\tau) = \frac{1}{2} + 2\cdot\frac{1}{2} + \frac{2(n-2)}{3} = \frac{2n}{3} + \frac{1}{6}.$$

To compute the variance of $LA_n(\tau)$, first notice that $\mathrm{Cov}(M_k, M_{k+r}) = 0$ whenever $r \ge 3$, and that $E[M_k M_{k+1}] = 0$. Now, going again back and forth between $[0,1]^n$ and $S_n$, we also obtain

$$E[M_k M_{k+2}] = \nu_5(\tau_1 < \tau_2 > \tau_3 < \tau_4 > \tau_5) = 2/15, \qquad E[M_1 M_3] = \nu_4(\tau_1 > \tau_2 < \tau_3 > \tau_4) = 5/24$$

and

$$E[M_{n-2} M_n] = \nu_4(\tau_1 < \tau_2 > \tau_3 < \tau_4) = 5/24.$$

This implies, from Proposition 2.2 and (1), that

$$\mathrm{Var}\,LA_n(\tau) = \frac{8n}{45} - \frac{13}{180}, \qquad n \ge 4.$$
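For small $n$ the mean and the variance can be recomputed exactly by enumerating $S_n$, which gives an independent check of the closed forms $2n/3 + 1/6$ and $8n/45 - 13/180$ (the latter for $n \ge 4$). The sketch below assumes the indicator representation (3); the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def la(a):
    """Length of the longest alternating subsequence via the indicator formula (3)."""
    n = len(a)
    interior_peaks = sum(a[k - 1] < a[k] > a[k + 1] for k in range(1, n - 1))
    return int(a[-1] > a[-2]) + 2 * int(a[0] > a[1]) + 2 * interior_peaks


def exact_moments(n):
    """Exact mean and variance of LA_n over the uniform measure on S_n."""
    values = [la(p) for p in permutations(range(n))]
    count = len(values)
    mean = Fraction(sum(values), count)
    variance = Fraction(sum(v * v for v in values), count) - mean ** 2
    return mean, variance


# Enumeration stays cheap up to n = 7 (5040 permutations).
moments = {n: exact_moments(n) for n in range(4, 8)}
```

Using `Fraction` keeps the arithmetic exact, so the comparison with the closed forms is an equality test rather than a floating-point approximation.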

Asymptotic normality. A collection of random variables $\{X_i\}_{i=1}^{\infty}$ is called $m$-dependent if $X_{t+m+s}$ is independent of $\{X_i\}_{i=1}^{t}$ for every $t \ge 1$ and $s \ge 1$. For such sequences the strong law of large numbers extends in a straightforward manner, just by partitioning the sum into appropriate sums of independent random variables, but the extension of the central limit theorem to this context is less trivial (although a 'small block' - 'big block' argument will do the job). For this purpose we recall the following particular case of a theorem due to Hoeffding and Robbins [3] (which can also be found in standard texts such as Durrett [2, Chapter 7] or Resnick [7, Chapter 8]).

Theorem 2.3 Let $(X_t)_{t \ge 1}$ be a sequence of identically distributed $m$-dependent bounded random variables. Then

$$\frac{(X_1 + \cdots + X_n) - n\,E X_1}{\gamma\sqrt{n}} \Rightarrow Z,$$

where $Z$ is a standard normal random variable, and the variance term is given by

$$\gamma^2 = \mathrm{Var}(X_1) + 2\sum_{t=2}^{m+1} \mathrm{Cov}(X_1, X_t).$$



Now, let $a = (a_1, a_2, \dots)$ be a sequence of iid Unif$([0,1])$ random variables, and let $a^{(n)} = (a_1, \dots, a_n)$ be the restriction of the sequence $a$ to the first $n$ indices. Recalling (1) and Proposition 2.2, it is clear that if $\tau$ is a uniform random permutation in $S_n$,

$$LA_n(\tau) \overset{d}{=} \mathbf{1}[a_n > a_{n-1}] + 2\cdot\mathbf{1}[a_1 > a_2] + 2\sum_{k=2}^{n-1} \mathbf{1}[a_{k-1} < a_k > a_{k+1}], \qquad (4)$$

where $\overset{d}{=}$ denotes equality in distribution. Therefore, since the random variables $\{\mathbf{1}[a_{k-1} < a_k > a_{k+1}] : k \ge 2\}$ are identically distributed and 2-dependent, we have by the strong law of large numbers that with probability one

$$\lim_{n\to\infty} \frac{1}{n}\sum_{k=2}^{n-1} \mathbf{1}[a_{k-1} < a_k > a_{k+1}] = \mu_3(a_1 < a_2 > a_3) = \frac{1}{3}.$$

Therefore, from (4), we get that, in probability,

$$\lim_{n\to\infty} \frac{1}{n} LA_n(\tau) = \frac{2}{3}.$$

More generally, applying the above central limit theorem, we have as $n \to \infty$

$$\frac{LA_n(\tau) - 2n/3}{\sqrt{n}\,\gamma} \Rightarrow N(0,1), \qquad (5)$$

where in our case, the variance term is given by

$$\gamma^2 = \mathrm{Var}(2\cdot\mathbf{1}[a_1 < a_2 > a_3]) + 2\,\mathrm{Cov}(2\cdot\mathbf{1}[a_1 < a_2 > a_3],\ 2\cdot\mathbf{1}[a_2 < a_3 > a_4]) + 2\,\mathrm{Cov}(2\cdot\mathbf{1}[a_1 < a_2 > a_3],\ 2\cdot\mathbf{1}[a_3 < a_4 > a_5]) = \frac{8}{45},$$

from the computations carried out in the previous paragraph.
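The value $8/45$ of the variance term can be reproduced from the three joint probabilities involved, each of which only depends on the relative order of five consecutive entries. The following sketch enumerates $S_5$ for that purpose; the helper names `prob` and `peak` are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

perms = list(permutations(range(5)))


def prob(event):
    """Probability of an order event under the uniform measure on S_5."""
    return Fraction(sum(1 for p in perms if event(p)), len(perms))


def peak(p, k):
    """Indicator that p has an interior local maximum at 0-based position k."""
    return p[k - 1] < p[k] > p[k + 1]


p_single = prob(lambda p: peak(p, 1))                   # P(a1 < a2 > a3)
p_adjacent = prob(lambda p: peak(p, 1) and peak(p, 2))  # peaks at neighbouring indices
p_skip = prob(lambda p: peak(p, 1) and peak(p, 3))      # P(a1 < a2 > a3 < a4 > a5)

# Each indicator is multiplied by 2 in (4), hence the factors 4 below.
var_term = 4 * (p_single - p_single ** 2)
cov_adjacent = 4 * (p_adjacent - p_single ** 2)
cov_skip = 4 * (p_skip - p_single ** 2)
gamma_sq = var_term + 2 * cov_adjacent + 2 * cov_skip
```

The adjacent covariance is negative because two neighbouring indices cannot both be local maxima, which is what pulls the variance term below the independent-case value $8/9$.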

Remark 2.4 The above approach via $m$-dependence has another advantage: using standard statements for $m$-dependent sequences, it provides various types of results on $LA_n(\tau)$ such as, for example, the exact fluctuation theory via the law of the iterated logarithm. In our setting, it gives:

$$\limsup_{n\to\infty} \frac{LA_n(\tau) - E\,LA_n(\tau)}{\sqrt{n \log\log n}} = \frac{4}{3\sqrt{5}},$$

$$\liminf_{n\to\infty} \frac{LA_n(\tau) - E\,LA_n(\tau)}{\sqrt{n \log\log n}} = -\frac{4}{3\sqrt{5}}.$$

Besides the LIL, other types of probabilistic statements on $LA_n(\tau)$ are possible, e.g., exponential inequalities, large deviations, etc. These types of statements are also true in the settings of our next sections.



3 Finite alphabet random words 

Consider a (finite) random sequence $a = (a_1, a_2, \dots, a_n)$ with distribution $\mu^{(n)}$, where $\mu$ is a probability measure supported on a finite set $[q] = \{1, \dots, q\}$. Our goal now is to study the length of the longest alternating subsequence of the random sequence $a$. This new situation differs from the previous one mainly in that the sequence can have repeated values, and in order to check if a point is a maximum or a minimum, it is not enough to 'look at' its nearest neighbors, losing the advantage of the 2-dependence that we had in the previous case. Instead, we can use the stationarity of the property 'being a local maximum' with respect to some extended sequence to study the asymptotic behaviour of $LA_n(a)$. As a matter of notation, we will use generically the expression $LA_n(\mu)$ for the distribution of the length of the longest alternating subsequence of a sequence $a = (a_1, a_2, \dots, a_n)$ having the product distribution $\mu^{(n)}$.

In this section we proceed more or less along the lines of the previous section, relating the counting of maxima to the length of the longest alternating subsequence and then, through mixing and ergodicity, obtain results on the asymptotic mean, variance, convergence of averages and asymptotic normality of the longest alternating subsequence. These results are presented in Theorem 3.1 (convergence in probability) and Theorem 3.6 (asymptotic normality).

Counting maxima and minima. Given a sequence $a = (a_1, a_2, \dots, a_n) \in [q]^n$, we say that $a$ has a local maximum at the index $k$ if (i) $a_k > a_{k+1}$ or $k = n$, and if (ii) for some $j < k$, $a_j < a_{j+1} = \cdots = a_{k-1} = a_k$, or for all $j < k$, $a_j = a_k$. Analogously, we say that $a$ has a local minimum at the index $k$ if (i) $a_k < a_{k+1}$ or $k = n$, and if (ii) for some $j < k$, $a_j > a_{j+1} = \cdots = a_{k-1} = a_k$. The identity (2) can be generalized, in a straightforward manner, to this context, so that

$$LA_n(a) = \#\,\text{local maxima of } a + \#\,\text{local minima of } a$$

$$= \mathbf{1}(a \text{ has a local maximum at } n) + 2\sum_{k=1}^{n-1} \mathbf{1}(a \text{ has a local maximum at } k).$$

Now, the only difficulty in adapting the proof of Proposition 2.2 to our current framework arises when moving in the direction of the gradient while trying to modify the alternating subsequence so that it consists only of maxima and minima. Indeed, we could get stuck at an index of gradient zero that is neither a maximum nor a minimum. But this difficulty can easily be overcome by just deciding to move to the right whenever we get into such a situation. We then end up with an alternating subsequence consisting only of maxima and minima, obtained through order preserving moves.
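The generalized identity can be tested exhaustively on short words over a small alphabet. In the sketch below, `has_local_max` transcribes the plateau-aware definition of a local maximum given above, and `la_dp` is a generic quadratic dynamic program used only as a reference; both names, the alphabet size 3 and the maximal length 7 are choices made for this illustration.

```python
from itertools import product


def la_dp(a):
    """Longest subsequence with a[l1] > a[l2] < a[l3] > ..., by O(n^2) dynamic programming."""
    n = len(a)
    neg_inf = float("-inf")
    odd = [1] * n
    even = [neg_inf] * n
    for i in range(n):
        for j in range(i):
            if a[j] > a[i]:
                even[i] = max(even[i], odd[j] + 1)
            if a[j] < a[i]:
                odd[i] = max(odd[i], even[j] + 1)
    return max(max(odd), max(even))


def has_local_max(a, k):
    """Local maximum at the 1-based index k, allowing repeated letters."""
    n = len(a)
    i = k - 1
    if not (k == n or a[i] > a[i + 1]):
        return False
    j = i
    while j > 0 and a[j - 1] == a[j]:
        j -= 1  # walk left through the plateau that ends at k
    # Either the plateau reaches the start, or it is entered by a strict rise.
    return j == 0 or a[j - 1] < a[j]


def la_maxima(a):
    """1(local max at n) + 2 * sum over k < n of 1(local max at k)."""
    n = len(a)
    return int(has_local_max(a, n)) + 2 * sum(has_local_max(a, k) for k in range(1, n))


mismatches = [
    w for n in range(1, 8) for w in product((1, 2, 3), repeat=n)
    if la_dp(w) != la_maxima(w)
]
```

Scanning left through the plateau is exactly what breaks the 2-dependence of the permutation case: the indicator at index $k$ may depend on arbitrarily many earlier letters.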

Infinite bilateral sequences. More generally, given an infinite bilateral sequence $a = (\dots, a_{-1}, a_0, a_1, \dots) \in [q]^{\mathbb{Z}}$, we say that $a$ has a local maximum at the index $k$ if for some $j < k$, $a_j < a_{j+1} = \cdots = a_k > a_{k+1}$ and, analogously, that $a$ has a local minimum at the index $k$ if for some $j < k$, $a_j > a_{j+1} = \cdots = a_k < a_{k+1}$. Also, we define $a^{(n)} = (a_1, \dots, a_n)$ to be the truncation of $a$ to the first $n$ positive indices. An important observation is the following: Set

$$A_k = \{a \in [q]^{\mathbb{Z}} : \text{for some } j \le 0,\ a_j > a_{j+1} = \cdots = a_k > a_{k+1}\},$$

$$A'_k = \{a \in [q]^{\mathbb{Z}} : \text{for some } j \le 0,\ a_j \ne a_{j+1} = \cdots = a_k \le a_{k+1}\},$$

and

$$A''_k = \{a \in [q]^{\mathbb{Z}} : \text{for some } j \ge 1,\ a_j < a_{j+1} = \cdots = a_k \le a_{k+1}\}.$$

Then, for any bilateral sequence $a \in [q]^{\mathbb{Z}}$, we have

$$\mathbf{1}(a^{(n)} \text{ has a local maximum at } k) = \mathbf{1}(a \text{ has a local maximum at } k) + \mathbf{1}_{A_k}(a), \quad \text{if } k < n,$$

and

$$\mathbf{1}(a^{(n)} \text{ has a local maximum at } n) = \mathbf{1}(a \text{ has a local maximum at } n) + \mathbf{1}_{A_n}(a) + \mathbf{1}_{A'_n}(a) + \mathbf{1}_{A''_n}(a).$$

Hence,

$$LA_n(a^{(n)}) = 2\sum_{k=1}^{n-1} \mathbf{1}(a \text{ has a local maximum at } k) + R_n(a), \qquad (6)$$

where the remainder term $R_n(a)$ is given by $R_n(a) := 2\sum_{k=1}^{n-1} \mathbf{1}_{A_k}(a) + \mathbf{1}(a^{(n)}$ has a local maximum at $n)$, and is such that $|R_n(a)| \le 3$, since the sets $\{A_k\}_{k=1}^{n-1}$ are pairwise disjoint.

Stationarity. Define the function $f : [q]^{\mathbb{Z}} \to \mathbb{R}$ via $f(a) := 2\cdot\mathbf{1}(a$ has a local maximum at the index $0)$. If $T : [q]^{\mathbb{Z}} \to [q]^{\mathbb{Z}}$ is the (shift) transformation such that $(Ta)_i = a_{i+1}$, and $T^{(k)}$ is the $k$-th iterate of $T$, it is clear that $2\cdot\mathbf{1}(a$ has a local maximum at $k) = f \circ T^{(k)}(a)$. With these notations, (6) becomes

$$LA_n(a^{(n)}) = \sum_{k=1}^{n-1} f \circ T^{(k)}(a) + R_n(a).$$

In particular, if $a$ is a random sequence with distribution $\mu^{(\mathbb{Z})}$, and if $T^{(k)} f$ is short for $f \circ T^{(k)}(a)$, the following holds true:

$$LA_n(\mu) \overset{d}{=} \sum_{k=1}^{n-1} T^{(k)} f + R_n(a). \qquad (7)$$

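The transformation $T$ is measure preserving with respect to $\mu^{(\mathbb{Z})}$ and, moreover, ergodic. Thus, by the classical ergodic theorem (see, for example, [8, Chapter V]), as $n \to \infty$, $\frac{1}{n}\sum_{k=1}^{n} T^{(k)} f \to E[f]$, where the convergence occurs almost surely and also in the mean. The limit can be easily computed:

$$E[f] = 2\sum_{k=0}^{\infty}\sum_{x \in [q]} \mathrm{Prob}\big(a_{-(k+1)} < a_{-k} = \cdots = a_0 = x > a_1\big) = 2\sum_{x \in [q]}\sum_{k=0}^{\infty} L_x^2\, p_x^{k+1} = 2\sum_{x \in [q]} \frac{L_x^2}{L_x + U_x}\, p_x,$$

where for $x \in [q]$, $p_x := \mu(\{x\})$, $L_x := \sum_{y < x} p_y$ and $U_x := \sum_{y > x} p_y$, so that $1 - p_x = L_x + U_x$.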

Oscillation. Given a probability distribution $\mu$ supported on $[q]$, we define the 'oscillation of $\mu$ at $x$' as $\mathrm{osc}_\mu(x) := (L_x^2 + U_x^2)/(L_x + U_x)$ and the total oscillation of the measure $\mu$ as $\mathrm{Osc}(\mu) := \sum_{x \in [q]} \mathrm{osc}_\mu(x)\, p_x$. Interpreting the results of the previous paragraph through (7), we conclude that

Theorem 3.1 Let $a = (a_i)_{i=1}^{n}$ be a sequence of iid random variables with common distribution $\mu$ supported on $[q]$, and let $LA_n(\mu)$ be the length of the longest alternating subsequence of $a$. Then,

$$\lim_{n\to\infty} \frac{LA_n(\mu)}{n} = \mathrm{Osc}(\mu), \quad \text{in the mean.}$$

In particular, if $\mu$ is the uniform distribution on $[q]$, $\mathrm{Osc}(\mu) = 2/3 - 1/(3q)$, and thus $LA_n(\mu)/n$ is concentrated around $2/3 - 1/(3q)$ both in the mean and in probability. We should mention here that Mansour [6], using generating function methods, obtained, for $\mu$ uniform, an explicit formula for $E\,LA_n(\mu)$, which, of course, is asymptotically equivalent to $(2/3 - 1/(3q))\,n$. Moreover, from (7) it is not difficult to derive a nonasymptotic expression for $E\,LA_n(\mu)$:

$$E\,LA_n(\mu) = n\,\mathrm{Osc}(\mu) + \sum_{x \in [q]} R_1(x)\, p_x + \sum_{x \in [q]} R_2(x)\, p_x^{n}, \qquad (8)$$

where the terms Ri{x) and R2{x) are given by: 

Ri[x) ^ -— H ^-osCu(2;) and R2(x) ^ — 5-. 

Lx + Ux {Lx + Uxf '^ ' ' Lx + Ux {Lx + Uxf 

Applying (8) in the uniform case, it is straightforward to recover Mansour's computation as given in [6].
As far as the size of $\mathrm{Osc}(\mu)$ is concerned, we have the following bounds for a general $\mu$.

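Proposition 3.2 Let $\mu$ be a probability measure supported on the finite set $[q]$. Then

$$\frac{1}{2}\Big(1 - \sum_{x \in [q]} p_x^2\Big) \le \mathrm{Osc}(\mu) \le \frac{2}{3}\Big(1 - \sum_{x \in [q]} p_x^3\Big). \qquad (9)$$

Proof. Note that $\sum_{x \in [q]} L_x p_x = \sum_{i < j} p_i p_j = \sum_{x \in [q]} U_x p_x$ and $\sum_{x \in [q]} L_x p_x + \sum_{x \in [q]} U_x p_x + \sum_{x \in [q]} p_x^2 = 1$, which implies that

$$\sum_{x \in [q]} L_x p_x = \sum_{x \in [q]} U_x p_x = \frac{1}{2}\Big(1 - \sum_{x \in [q]} p_x^2\Big). \qquad (10)$$

Similarly, for any permutation $\sigma \in S_3$, we have that $\sum_{x \in [q]} L_x U_x p_x = \sum_{i_1 < i_2 < i_3} p_{i_1} p_{i_2} p_{i_3} = \sum_{i_{\sigma_1} < i_{\sigma_2} < i_{\sigma_3}} p_{i_1} p_{i_2} p_{i_3}$, which implies that $6\sum_{x \in [q]} L_x U_x p_x = \sum_{i_1 \ne i_2 \ne i_3} p_{i_1} p_{i_2} p_{i_3}$, the last sum running over triples of pairwise distinct indices. Finally, an inclusion-exclusion argument leads to

$$\sum_{i_1 \ne i_2 \ne i_3} p_{i_1} p_{i_2} p_{i_3} = 1 - 3\sum_{i_1 = i_2} p_{i_1} p_{i_2} + 2\sum_{i_1 = i_2 = i_3} p_{i_1} p_{i_2} p_{i_3} = 1 - 3\sum_{x \in [q]} p_x^2 + 2\sum_{x \in [q]} p_x^3,$$

and therefore

$$\sum_{x \in [q]} L_x U_x p_x = \frac{1}{6} - \frac{1}{2}\sum_{x \in [q]} p_x^2 + \frac{1}{3}\sum_{x \in [q]} p_x^3. \qquad (11)$$

Now, to obtain the upper bound in (9), note that

$$\mathrm{Osc}(\mu) = \sum_{x \in [q]} \frac{L_x^2 + U_x^2}{L_x + U_x}\, p_x = \sum_{x \in [q]} (L_x + U_x)\, p_x - 2\sum_{x \in [q]} \frac{L_x U_x}{L_x + U_x}\, p_x, \qquad (12)$$

so that in particular, since $L_x + U_x \le 1$, $\mathrm{Osc}(\mu) \le \sum_{x \in [q]} (L_x + U_x)\, p_x - 2\sum_{x \in [q]} L_x U_x p_x$. Hence, using (10) and (11),

$$\mathrm{Osc}(\mu) \le \frac{2}{3}\Big(1 - \sum_{x \in [q]} p_x^3\Big).$$

For the lower bound, note that $4\sum_{x \in [q]} \frac{L_x U_x}{L_x + U_x}\, p_x \le \sum_{x \in [q]} (L_x + U_x)\, p_x$, and from (12) and (10) we get

$$\mathrm{Osc}(\mu) \ge \frac{1}{2}\sum_{x \in [q]} (L_x + U_x)\, p_x = \frac{1}{2}\Big(1 - \sum_{x \in [q]} p_x^2\Big).$$

$\square$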

An interesting problem would be to determine the distribution $\mu$ over $[q]$ that maximizes the oscillation. It is not hard to prove that such an optimal distribution should be symmetric about $(q - 1)/2$, but it is harder to establish its shape (at least asymptotically in $q$).
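The total oscillation and the two bounds of Proposition 3.2 are easy to evaluate numerically. The following sketch uses exact rational arithmetic; the function names `osc` and `osc_bounds` and the non-uniform test distribution are assumptions made for illustration.

```python
from fractions import Fraction


def osc(p):
    """Total oscillation Osc(mu) of mu = (p_1, ..., p_q), given as a list of Fractions."""
    total = Fraction(0)
    for x, px in enumerate(p):
        lower = sum(p[:x], Fraction(0))       # L_x: mass strictly below x
        upper = sum(p[x + 1:], Fraction(0))   # U_x: mass strictly above x
        if lower + upper > 0:                 # skipped only for a point mass at x
            total += (lower ** 2 + upper ** 2) / (lower + upper) * px
    return total


def osc_bounds(p):
    """Lower and upper bounds from (9)."""
    lower = (1 - sum(px ** 2 for px in p)) / 2
    upper = Fraction(2, 3) * (1 - sum(px ** 3 for px in p))
    return lower, upper


uniform_values = {q: osc([Fraction(1, q)] * q) for q in range(2, 8)}
skewed = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]  # hypothetical example
skewed_osc = osc(skewed)
skewed_lower, skewed_upper = osc_bounds(skewed)
```

For the uniform distribution on two letters the upper bound in (9) is attained, while for the skewed example above the oscillation $19/36$ sits strictly between the two bounds.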

Mixing. The use of ergodic properties to analyze the random variable $LA_n(\mu)$ goes beyond the mere application of the ergodic theorem. Indeed, the random variables $\{T^{(k)} f : k \in \mathbb{Z}\}$ introduced in the previous paragraph exhibit mixing, or 'long range' independence, meaning that as $n \to \infty$

$$\sup_{A \in \mathcal{F}_{\ge 0},\ B \in \mathcal{F}_{\le -n}} |\mathrm{Prob}(A \mid B) - \mathrm{Prob}(A)| \to 0,$$

where $\mathcal{F}_{\ge n}$ (respectively $\mathcal{F}_{\le n}$) is the $\sigma$-field of events generated by $\{T^{(k)} f : k \ge n\}$ (respectively $\{T^{(k)} f : k \le n\}$). This kind of mixing condition is usually called uniformly strong mixing or $\varphi$-mixing, and the decreasing sequence

$$\varphi(n) := \sup_{A \in \mathcal{F}_{\ge 0},\ B \in \mathcal{F}_{\le -n}} |\mathrm{Prob}(A \mid B) - \mathrm{Prob}(A)|, \qquad (13)$$

is called the rate of uniformly strong mixing (see, for example, [1, Chapter 1]). Below, Proposition 3.4 asserts that, in our case, such a rate decreases exponentially. Let us prove the following lemma first.

Lemma 3.3 Let $a = (a_i)_{i \in \mathbb{Z}}$ be a bilateral sequence of iid random variables with common distribution $\mu$ supported on $[q]$. If $C_{n,t}$ corresponds to the event $\{a_{-n} = \cdots = a_{-n+t-1} \ne a_{-n+t}\}$, then:

(i) For any $A \in \mathcal{F}_{\ge 0}$ and any $t \le n$, the event $C_{n,t} \cap A$ is independent of the $\sigma$-field $\mathcal{G}_{<-n}$ of events generated by $\{a_i : i < -n\}$.

(ii) Restricted to the event $C_{n,t}$, the $\sigma$-fields $\mathcal{F}_{\ge 0}$ and $\mathcal{G}_{<-n}$ are independent.



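Proof. Define the event $B_{r,s} := \{a_r < a_{r+1} = \cdots = a_s > a_{s+1}\}$. Then, for $s_1 < s_2 < \cdots < s_m$, the formula $\prod_{i=1}^{m} T^{(s_i)} f = 2^m \sum \prod_{i=1}^{m} \chi_{B_{r_i, s_i}}$ holds, where the sum runs over the $r_1, \dots, r_m$ such that $s_{i-1} < r_i < s_i$ (letting $s_0 = -\infty$). Now, since the random variables $\{T^{(i)} f,\ i \in \mathbb{Z}\}$ take only two values, for any $A \in \mathcal{F}_{\ge 0}$ the random variable $\chi_A$ can be expressed as a linear combination of terms of the form $\prod_{i=1}^{r} T^{(s_i)} f$, where $0 \le s_1 < \cdots < s_r$.

Next,

$$\chi_{C_{n,t}} \prod_{i=1}^{m} T^{(s_i)} f = 2^m\, \chi_{C_{n,t}} \Big(\sum \prod_{i=1}^{m} \chi_{B_{r_i, s_i}}\Big) = 2^m\, \chi_{C_{n,t}} \Big(\sum_{r_1 \ge -n+t-1} \prod_{i=1}^{m} \chi_{B_{r_i, s_i}}\Big),$$

which implies that $\chi_{C_{n,t}} \prod_{i=1}^{m} T^{(s_i)} f$ and $\mathcal{G}_{<-n}$ are independent. This implies, in particular, the independence of the events $C_{n,t} \cap A$ and $B$, for any $A \in \mathcal{F}_{\ge 0}$ and $B \in \mathcal{G}_{<-n}$, proving (i). The statement (ii) follows directly from (i). $\square$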

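Proposition 3.4 Let $a = (a_i)_{i \in \mathbb{Z}}$ be a bilateral sequence of iid random variables with common distribution $\mu$ supported on $[q]$. If the event $A$ belongs to the $\sigma$-field $\mathcal{F}_{\ge 0}$, then

$$\|\mathrm{Prob}(A \mid \mathcal{G}_{<-n}) - \mathrm{Prob}(A)\|_\infty := \sup_{B \in \mathcal{G}_{<-n}} |\mathrm{Prob}(A \mid B) - \mathrm{Prob}(A)| \le 2q\kappa^{n},$$

where $\kappa := \max_{x \in [q]} \mu(\{x\})$. In particular, the rate of uniform strong mixing of the sequence $\{T^{(k)} f : k \in \mathbb{Z}\}$ (see (13)) satisfies $\varphi(n) \le 2q\kappa^{n-1}$.

Proof. Let $A \in \mathcal{F}_{\ge 0}$. By Lemma 3.3, $\mathrm{Prob}(A \cap C_{n,r} \mid \mathcal{G}_{<-n}) = \mathrm{Prob}(A \cap C_{n,r})$, whenever $r \le n$. Therefore,

$$\mathrm{Prob}(A \mid \mathcal{G}_{<-n}) = \sum_{r=1}^{n} \mathrm{Prob}(A \cap C_{n,r} \mid \mathcal{G}_{<-n}) + \mathrm{Prob}(A \cap \{a_{-n} = \cdots = a_0\} \mid \mathcal{G}_{<-n})$$

$$= \sum_{r=1}^{n} \mathrm{Prob}(A \cap C_{n,r}) + \mathrm{Prob}(A \cap \{a_{-n} = \cdots = a_0\} \mid \mathcal{G}_{<-n})$$

$$= \mathrm{Prob}(A) + \big(\mathrm{Prob}(A \cap \{a_{-n} = \cdots = a_0\} \mid \mathcal{G}_{<-n}) - \mathrm{Prob}(A \cap \{a_{-n} = \cdots = a_0\})\big).$$

Then, it follows that

$$\|\mathrm{Prob}(A \mid \mathcal{G}_{<-n}) - \mathrm{Prob}(A)\|_\infty \le \mathrm{Prob}(A \cap \{a_{-n} = \cdots = a_0\}) + \|\mathrm{Prob}(A \cap \{a_{-n} = \cdots = a_0\} \mid \mathcal{G}_{<-n})\|_\infty$$

$$\le 2\,\|\mathrm{Prob}(a_{-n} = \cdots = a_0 \mid \mathcal{G}_{<-n})\|_\infty \le 2q\kappa^{n},$$

and the last conclusion of the proposition follows trivially from $\mathcal{G}_{<-n} \supseteq \mathcal{F}_{\le -(n+1)}$. $\square$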

Taking advantage of the mixing property we can now infer without major effort how the asymptotic variance behaves and also deduce the asymptotic normality of the $LA_n(\mu)$ statistic. This is done in the next two paragraphs.

Variance. The computation of the variance of the sequence $S_n = \sum_{k=1}^{n} T^{(k)} f$ is straightforward. Indeed

$$\mathrm{Var}(S_n) = n\Big[\mathrm{Cov}(f, f) + 2\sum_{k=1}^{n-1} \mathrm{Cov}\big(f, T^{(k)} f\big)\Big] - 2\sum_{k=1}^{n-1} k\,\mathrm{Cov}\big(f, T^{(k)} f\big), \qquad (14)$$

and the mixing property from Proposition 3.4 implies that $|\mathrm{Cov}(f, T^{(k)} f)|$ decreases geometrically in $k$, so that all the series involved in (14) converge. Therefore,

$$\mathrm{Var}(S_n) = n\gamma^2 + O(1), \quad\text{where}\quad \gamma^2 = \mathrm{Cov}(f, f) + 2\sum_{k=1}^{\infty} \mathrm{Cov}\big(f, T^{(k)} f\big). \qquad (15)$$

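Moreover, for $k \le \ell$, $|\mathrm{Cov}(\mathbf{1}_{A_\ell}(a), T^{(k)} f)| \le E\,\mathbf{1}_{A_\ell}(a) \le \kappa^{\ell}$, and for $k > \ell$, making use of Proposition 3.4, $|\mathrm{Cov}(\mathbf{1}_{A_\ell}(a), T^{(k)} f)| \le 4q\kappa^{k-\ell-1}\, E\,\mathbf{1}_{A_\ell}(a) \le 4q\kappa^{k-1}$. This implies that, as $n \to \infty$,

$$\Big|\mathrm{Cov}\Big(\sum_{k=1}^{n-1} T^{(k)} f,\ \sum_{\ell=1}^{n-1} \mathbf{1}_{A_\ell}(a)\Big)\Big| \le 4q\Big(\sum_{k \le \ell} \kappa^{\ell} + \sum_{k > \ell} \kappa^{k-1}\Big) = O(1),$$

where the sums run over $1 \le k, \ell \le n-1$.

Similarly, by making use of the Cauchy-Schwarz inequality, we have that $\mathrm{Cov}\big(\sum_{k=1}^{n-1} T^{(k)} f,\ \mathbf{1}_{\Lambda_n}(a)\big) \to 0$, where $\Lambda_n$ is either one of the events $A_n$, $A'_n$ or $A''_n$. Finally, using the fact that $\mathrm{Cov}\big(\sum_{k=1}^{n-1} T^{(k)} f,\ T^{(n)} f\big) = \sum_{k=1}^{n-1} \mathrm{Cov}\big(f, T^{(k)} f\big)$ is bounded as $n \to \infty$, we conclude that $\mathrm{Cov}\big(\sum_{k=1}^{n-1} T^{(k)} f,\ R_n(a)\big) = O(1)$, as $n \to \infty$. This implies the corresponding extension of (15) to $LA_n(\mu)$:

$$\mathrm{Var}(LA_n(\mu)) = n\gamma^2 + O(1) \quad\text{as } n \to \infty.$$

Note that the bound just established is not vacuous, since the boundedness of $R_n(a)$ alone only guarantees the weaker estimate $\mathrm{Var}(LA_n(\mu)) = n\gamma^2 + O(n^{1/2})$.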

Now, let us proceed to compute $\gamma^2$. Define the function $f_\ell : [q]^{\mathbb{Z}} \to \mathbb{R}$ via $f_\ell(a) = 2\cdot\mathbf{1}(a_{-\ell} < a_{-\ell+1} = \cdots = a_0 > a_1)$, so that $f(a) = \sum_{\ell=1}^{\infty} f_\ell(a)$. Notice that

Gov 



(/,rW/, 



H = <^ 



if fc > Z + 2 
4 E (1^) (iyPi) i=.A,Px - 2 Osc (m) E Llp'y iik = l + l 

if i< fc < ; 



-2 0sc(m) ELlp'y 
yelq] 
4E4H-2 0sc(m) E^Pi 

ye [9] yGbl 



if = fc < Z, 



thus 7 is given by 



00 ex: / 

Var(/) + 2E E Cov(/,tW/, 

I 17— t._1 ^ 



fc=l/=A;-l 



Osc (Ai) 2 - 3 Osc (Ai) - 4 E 



Lrr 



P.] +8 E 






PxPy 



We further mention at this point that Mansour [6] has already obtained an exact expression for the variance when $\mu$ is the uniform distribution over $[q]$, given (as can be checked from (15)) by



^ =45 



(l + l/g)(l~3/4g)(l-l/2g) 
(1 - l/2g) 



Asymptotic normality. Under appropriate conditions (say, asymptotic positive variance and fast enough mixing), it is natural to expect the sequence of partial sums to be asymptotically normal. In our model, this is indeed the case. Let us recall the following central limit theorem, which goes back to Volkonskii and Rozanov [11, Theorem 1.2] and which can be found in full generality in texts such as Bradley [1, Theorem 10.3].

Theorem 3.5 Let $x = (x_i)_{i \in \mathbb{Z}}$ be a strictly stationary sequence of bounded random variables such that the sequence $\alpha(n) := \sup_{A \in \mathcal{F}_{\ge 0},\ B \in \mathcal{F}_{\le -n}} |\mathrm{Prob}(A \cap B) - \mathrm{Prob}(A)\,\mathrm{Prob}(B)|$ is summable (i.e., $\sum_{n \ge 1} \alpha(n) < \infty$), where $\mathcal{F}_{\ge 0}$ is the $\sigma$-field generated by the random variables $\{x_i : i \ge 0\}$ and $\mathcal{F}_{\le -n}$ is the $\sigma$-field generated by the random variables $\{x_i : i \le -n\}$. Then,

1. $\gamma^2 := \mathrm{Var}(x_0) + 2\sum_{t=1}^{\infty} \mathrm{Cov}(x_0, x_t)$ exists in $[0, \infty)$, the sum being absolutely convergent.

2. Moreover, if $\gamma^2 > 0$, then as $n \to \infty$,

$$\frac{\sum_{t=1}^{n} x_t - n\,E x_0}{\sqrt{n}\,\gamma} \Rightarrow Z,$$

where $Z$ is a standard normal random variable.

Now, the asymptotic normality of $LA_n(\mu)$, namely, the fact that as $n \to \infty$,

$$\frac{LA_n(\mu) - n\,\mathrm{Osc}(\mu)}{\sqrt{n}\,\gamma} \Rightarrow Z,$$

is clear: by Proposition 3.4 the mixing coefficients $\alpha(n)$ decrease geometrically, implying the summability of $\sum_n \alpha(n)$.

Theorem 3.6 Let $a = (a_i)_{i=1}^{n}$ be a sequence of iid random variables, with common distribution $\mu$ supported on $[q]$, and let $LA_n(\mu)$ be the length of the longest alternating subsequence of $a$. Then, as $n \to \infty$,

$$\frac{LA_n(\mu) - n\,\mathrm{Osc}(\mu)}{\sqrt{n}\,\gamma} \Rightarrow Z,$$

where $Z$ is a standard normal random variable and $\gamma$ is given by (15).



4 Markovian words 

There is another framework allowing us to prove similar results in a more general setting. Consider now an ergodic Markov chain $(x_k)_{k \ge 0}$ started at stationarity whose state space is a finite linearly ordered set $\mathcal{A}$, so that without loss of generality, $\mathcal{A} = [q]$. Our objective (as before) is to study the behavior of the statistic $LA_n(x_0, \dots, x_n)$.

Adding gradient information to the chain. Let us consider the related process $(y_k)_{k \ge 0}$ defined recursively as follows:

- $y_0 = 1$.

- $y_{k+1} = 1$ if $x_{k+1} > x_k$ or if $x_{k+1} = x_k$ and $y_k = 1$.

- $y_{k+1} = -1$ if $x_{k+1} < x_k$ or if $x_{k+1} = x_k$ and $y_k = -1$.

This new sequence basically carries the information on whether the sequence is increasing or decreasing at $k$ (we define the sequence $x_1, x_2, \dots$ to be increasing at $k$ if $x_k > x_{k-1}$ or if it is increasing at $k - 1$ and $x_k = x_{k-1}$; analogously, the sequence is decreasing at $k$ if $x_k < x_{k-1}$ or if it is decreasing at $k - 1$ and $x_k = x_{k-1}$).
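The following holds true for the process $(x_k, y_k)_{k \ge 0}$.

Proposition 4.1 The process $(x_k, y_k)_{k \ge 0}$ is Markov, with transition probabilities given by

$$P_{(r,\pm 1) \to (s,1)} = p_{r,s}\,\mathbf{1}(s > r), \qquad P_{(r,1) \to (r,1)} = p_{r,r},$$

$$P_{(r,\pm 1) \to (s,-1)} = p_{r,s}\,\mathbf{1}(s < r), \qquad P_{(r,-1) \to (r,-1)} = p_{r,r},$$

and stationary measure given by

$$\pi_{(r,1)} = (1 - p_{r,r})^{-1} \sum_{s < r} \pi_s\, p_{s,r}, \qquad \pi_{(r,-1)} = (1 - p_{r,r})^{-1} \sum_{s > r} \pi_s\, p_{s,r}.$$

Moreover, the Markov process $(x_k, y_{k-1}, y_k)_{k \ge 1}$ has a stationary measure given by

$$\pi_{(r,1,1)} = \sum_{t < s \le r} \frac{\pi_t\, p_{t,s}\, p_{s,r}}{1 - p_{s,s}}, \qquad \pi_{(r,-1,-1)} = \sum_{t > s \ge r} \frac{\pi_t\, p_{t,s}\, p_{s,r}}{1 - p_{s,s}},$$

$$\pi_{(r,1,-1)} = \sum_{t < s > r} \frac{\pi_t\, p_{t,s}\, p_{s,r}}{1 - p_{s,s}}, \qquad \pi_{(r,-1,1)} = \sum_{t > s < r} \frac{\pi_t\, p_{t,s}\, p_{s,r}}{1 - p_{s,s}}.$$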
t<s>r -L ~ Ps,s t>s<r ^ ^ P. 
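Proof. The process is Markov since by definition $y_{k+1} \in \sigma(x_k, x_{k+1}, y_k)$ and since $(x_k)_{k \ge 0}$ is Markov. The transition probabilities are easily obtained from the definition and moreover,

$$\sum_{r} \pi_{(r,1)} P_{(r,1) \to (u,1)} + \sum_{r} \pi_{(r,-1)} P_{(r,-1) \to (u,1)}$$

$$= \sum_{r < u} (1 - p_{r,r})^{-1} \sum_{t < r} \pi_t\, p_{t,r}\, p_{r,u} + \sum_{r < u} (1 - p_{r,r})^{-1} \sum_{t > r} \pi_t\, p_{t,r}\, p_{r,u} + (1 - p_{u,u})^{-1} \sum_{t < u} \pi_t\, p_{t,u}\, p_{u,u}$$

$$= \sum_{r < u} (1 - p_{r,r})^{-1} \sum_{t \ne r} \pi_t\, p_{t,r}\, p_{r,u} + (1 - p_{u,u})^{-1} \sum_{t < u} \pi_t\, p_{t,u}\, p_{u,u}$$

$$= \sum_{t < u} \pi_t\, p_{t,u} + (1 - p_{u,u})^{-1} \sum_{t < u} \pi_t\, p_{t,u}\, p_{u,u}$$

$$= \pi_{(u,1)}.$$

Similar computations show that

$$\sum_{r} \pi_{(r,1)} P_{(r,1) \to (u,-1)} + \sum_{r} \pi_{(r,-1)} P_{(r,-1) \to (u,-1)} = \pi_{(u,-1)},$$

thus proving that $\pi_{(r,\pm 1)}$ is the stationary measure of $(x_k, y_k)_{k \ge 0}$.

For the chain $(x_k, y_{k-1}, y_k)_{k \ge 1}$, let us only verify one case, the other cases being similar:

$$\sum_{r} \pi_{(r,1,1)} P_{(r,1,1) \to (u,1,1)} + \sum_{r} \pi_{(r,-1,1)} P_{(r,-1,1) \to (u,1,1)}$$

$$= \sum_{r \le u} \sum_{t < s \le r} \frac{\pi_t\, p_{t,s}\, p_{s,r}}{1 - p_{s,s}}\, p_{r,u} + \sum_{r \le u} \sum_{t > s < r} \frac{\pi_t\, p_{t,s}\, p_{s,r}}{1 - p_{s,s}}\, p_{r,u}$$

$$= \sum_{s < r \le u} \frac{p_{s,r}}{1 - p_{s,s}}\, p_{r,u} \sum_{t < s} \pi_t\, p_{t,s} + \sum_{s < r \le u} \frac{p_{s,r}}{1 - p_{s,s}}\, p_{r,u} \sum_{t > s} \pi_t\, p_{t,s} + \sum_{s = r \le u} \frac{p_{s,r}}{1 - p_{s,s}}\, p_{r,u} \sum_{t < s} \pi_t\, p_{t,s}$$

$$= \sum_{s < r \le u} \pi_s\, p_{s,r}\, p_{r,u} + \sum_{s < r \le u} \frac{\pi_s\, p_{s,r}\, p_{r,u}\, p_{r,r}}{1 - p_{r,r}}$$

$$= \sum_{s < r \le u} \frac{\pi_s\, p_{s,r}\, p_{r,u}}{1 - p_{r,r}} = \pi_{(u,1,1)}.$$

$\square$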

Oscillations in a Markov chain. Given an ergodic Markov chain $x = (x_k)_{k \ge 0}$ whose state space is a finite linearly ordered set, define

$$\mathrm{Osc}^{+}(x) := \sum_{t < s > r} \frac{\pi_t\, p_{t,s}\, p_{s,r}}{1 - p_{s,s}}, \qquad \mathrm{Osc}^{-}(x) := \sum_{t > s < r} \frac{\pi_t\, p_{t,s}\, p_{s,r}}{1 - p_{s,s}},$$

and $\mathrm{Osc}(x) := \mathrm{Osc}^{+}(x) + \mathrm{Osc}^{-}(x)$ $(= 2\,\mathrm{Osc}^{+}(x) = 2\,\mathrm{Osc}^{-}(x))$. The following theorem holds:

Theorem 4.2 Let $LA_n(x_0, \dots, x_n)$ be the length of the longest alternating subsequence of the first $n + 1$ elements of the Markov chain $(x_k)_{k \ge 0}$. Then, as $n \to \infty$,

$$\frac{LA_n(x_0, \dots, x_n)}{n} \to \mathrm{Osc}(x)$$

in the mean and almost surely.
Proof. From the very definition of $y_k$,

$$LA_n(x_0, \dots, x_n) = 1 + \sum_{k=0}^{n-1} \mathbf{1}(y_k y_{k+1} = -1),$$

where the summand 1 accounts for the last entry, which is always a local extremum; therefore, by the ergodic theorem,

$$\frac{LA_n(x_0, \dots, x_n)}{n} \to \mathrm{Prob}_\pi(y_0 y_1 = -1),$$

in the mean and almost surely. Now, from Proposition 4.1,

$$\mathrm{Prob}_\pi(y_0 y_1 = -1) = \sum_{t < s > r} \frac{\pi_t\, p_{t,s}\, p_{s,r}}{1 - p_{s,s}} + \sum_{t > s < r} \frac{\pi_t\, p_{t,s}\, p_{s,r}}{1 - p_{s,s}},$$

from which the result follows. $\square$

Remark 4.3 The case $p_{t,s} = p_s$ (and therefore $\pi_t = p_t$) corresponds to iid letters, thus recovering Theorem 3.1.
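The quantities $\mathrm{Osc}^{+}$ and $\mathrm{Osc}^{-}$ can be evaluated numerically for a given transition matrix. The sketch below is an illustration under stated assumptions: the stationary distribution is approximated by power iteration (which presumes an aperiodic ergodic chain), the function names are hypothetical, and the two transition matrices are made-up examples. The first one has identical rows, so it is the iid case of the remark.

```python
def stationary(P, iterations=2000):
    """Approximate the stationary distribution of P by power iteration (aperiodic chains)."""
    q = len(P)
    pi = [1.0 / q] * q
    for _ in range(iterations):
        pi = [sum(pi[t] * P[t][s] for t in range(q)) for s in range(q)]
    return pi


def osc_plus_minus(P, pi):
    """Return (Osc+, Osc-): the sums over t < s > r and t > s < r, respectively."""
    q = len(P)
    plus = minus = 0.0
    for s in range(q):
        if P[s][s] == 1.0:
            continue  # absorbing state, excluded for an ergodic chain with q > 1
        weight = 1.0 / (1.0 - P[s][s])
        rise_in = sum(pi[t] * P[t][s] for t in range(s))          # arrivals from t < s
        fall_in = sum(pi[t] * P[t][s] for t in range(s + 1, q))   # arrivals from t > s
        fall_out = sum(P[s][r] for r in range(s))                 # departures to r < s
        rise_out = sum(P[s][r] for r in range(s + 1, q))          # departures to r > s
        plus += weight * rise_in * fall_out
        minus += weight * fall_in * rise_out
    return plus, minus


p = [1 / 2, 1 / 3, 1 / 6]
iid_chain = [p[:] for _ in range(3)]            # every row equals p
iid_plus, iid_minus = osc_plus_minus(iid_chain, stationary(iid_chain))

chain = [[0.2, 0.5, 0.3], [0.4, 0.1, 0.5], [0.3, 0.6, 0.1]]  # hypothetical ergodic chain
chain_plus, chain_minus = osc_plus_minus(chain, stationary(chain))
```

For the iid matrix the total $\mathrm{Osc}^{+} + \mathrm{Osc}^{-}$ agrees with the total oscillation $\sum_x \frac{L_x^2 + U_x^2}{L_x + U_x} p_x = 19/36$ of the letter distribution, and in both examples the two halves coincide, as maxima and minima must alternate.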

Central limit theorem. In case the asymptotic variance term of $LA_n(x_0, \dots, x_n)$ is nonzero, then since $LA_n(x_0, \dots, x_n)$ is an additive functional of the finite Markov chain $(x_k, y_{k-1}, y_k)_{k \ge 1}$, and since the mixing rate of an ergodic Markov chain with finite state space is exponentially decreasing, Theorem 3.5 implies that, for some $\gamma > 0$,

$$\frac{LA_n(x_0, \dots, x_n) - n\,\mathrm{Osc}(x)}{\sqrt{n}\,\gamma} \Rightarrow Z,$$

where $Z$ is a standard normal random variable. The reader should contrast this last fact with the increasing subsequence results, where the iid and Markov limiting laws differ when the size of the alphabet is large enough ([4]).